Abstract

The communication between a multiple-antenna transmitter and multiple receivers (users), each with 
one or more antennas, can be significantly enhanced by providing the users' channel state 
information at the transmitter (CSIT), as this allows for scheduling, beamforming and 
multiuser multiplexing gains. The traditional view on how to enable CSIT has been as follows: In 
time-division duplexed (TDD) systems, uplink (UL) and downlink (DL) channel reciprocity allows the 
use of a training sequence in the UL direction, which is exploited to obtain an UL channel estimate. This 
estimate is in turn recycled in the next downlink slot. In frequency-division duplexed (FDD) systems, 
which lack the UL and DL reciprocity, the CSIT is provided via the use of a dedicated feedback link of 
limited capacity between the receivers and the transmitter. In this paper, we focus on TDD systems and 
call this classical approach into question. In particular, we show that the traditional TDD setup above fails 
to fully exploit the channel reciprocity. In fact, we show that the system can benefit 
from a combined CSIT acquisition strategy mixing the use of limited feedback and that of a training 
sequence. This combining gives rise to a very interesting joint estimation and detection problem for 
which we propose two iterative algorithms. An outage rate based framework is also developed which 
gives the optimal resource split between training and feedback. We demonstrate the potential of this 
hybrid combining in terms of the improved CSIT quality under a global training and feedback resource 
constraint. 

I. Introduction 

Multiple-antenna transmitters and receivers are instrumental to optimizing the performance 
of bandwidth and power limited wireless communication systems. In the downlink (DL), in 
particular, the communication between a multiple-antenna base station (BS) and one or 
more users, each with one or more antennas, can be significantly enhanced through 
the use of scheduling, beamforming and power allocation algorithms, be it in single-user or 
multi-user mode (spatial division multiplexing). To allow for beamforming and/or multi-user 
multiplexing capability, the BS transmitter must, however, be informed of the channel state 
information (CSI) of each of the served users [2], [3], except when the number of users reaches 
an asymptotic (large) regime, in which case a random opportunistic beamforming scheme can 
be exploited [4], [5]. This has motivated the proposal of many techniques for providing the 
channel state information at the transmitter (CSIT) in an efficient manner. Proposals for how 
to provide CSIT roughly fall in two categories depending upon the chosen duplexing scheme 
for the considered wireless network. In the case of time-division duplex (TDD) systems, it has 
generally been assumed that CSIT acquisition should exploit the reciprocity of the uplink (UL) and DL channels, 
so as to avoid the use of any resource-consuming feedback channel [6], [7]. In current TDD systems, 
reciprocity is exploited through a training sequence sent by the 
user on the UL, from which the BS first builds an estimate of the UL channel, which in turn 
serves as an estimate of the DL channel in the next DL slot [6]. In frequency-division duplex 
(FDD) systems, the UL and DL portions of the bandwidth are normally far apart, and hence the 
channel realizations can be safely assumed to be independent of each other. This lack of channel 
reciprocity motivates instead the use of a dedicated feedback link over which the user conveys 
information about the estimated DL channel back to the BS. Recently, several interesting 
strategies have been proposed for how to best use a limited feedback channel and still provide 
the BS with exploitable CSIT (see [8], [9], [10], [11] and the references therein for further 
details). 

Although in the past the balance has tilted in favor of FDD systems when choosing 
a duplexing scheme (in part because of heavy legacy issues in voice-oriented 2G networks 
and also because of interference management between UL and DL), current discussions in the 
standardization groups indicate an increasing level of interest in TDD for upcoming wireless 
data-access networks (e.g., WiMAX), caused partly by its advantages in maintaining system 
flexibility with respect to UL and DL traffic loads, and mostly because TDD systems are seen 
as more efficient in providing the CSIT required by several MIMO DL schemes, thanks to the 
channel reciprocity. 

In this paper, we focus on the problem of CSIT acquisition in a TDD system. We take a 
step back and shed some critical light on the traditional approach above, which exploits 
the channel reciprocity exclusively via training sequences. In fact, we show that this 
approach fails to fully exploit the channel reciprocity. The key shortcoming is as follows: when 
sending a training sequence in the UL of a traditional TDD system, the user allows the BS to 
estimate the channel with a classical channel estimator (for example, a least-squares (LS) or 
minimum mean square error (MMSE) estimator). However, the user 
itself knows the channel coefficients (obtained during the current DL frame, 
from the DL synchronization sequence or other control signals, or even from previous DL 
frames if the channel is correlated in time) but does not exploit that knowledge to 
facilitate the CSIT acquisition by the BS. Instead, it uses this knowledge only locally. 

Interestingly, by contrast, in FDD systems, the user exploits its DL channel knowledge by 
quantizing the channel and sending the result over a dedicated feedback link (actually UL 
bandwidth is used for this feedback along with UL data transmission). In the FDD case, UL 
training is used by the BS solely for UL data detection as this UL training cannot give any direct 
information to the BS about the DL channel coefficients. 

In this paper, we point out that in TDD systems there is a unique opportunity to combine the 
advantages of both forms of CSIT acquisition. In doing so, we obtain a new CSIT acquisition 
scheme mixing the classical channel estimation using training with the quantized limited channel 
feedback of the same channel. This gives us a framework for fully utilizing the channel reciprocity 
in a TDD setup and it improves the classical trade-off between the CSIT quality and the amount 
of training/feedback resource used. We characterize the optimal CSIT acquisition structure 
under this novel framework. This hybrid CSIT acquisition setup gives rise to a very interesting 
joint estimation and detection problem for which we propose two iterative algorithms. We 
further propose a sub-optimal outage rate based approach which helps us to optimize the fixed 
resource partitioning between training and quantized feedback phases. We adapt this optimization 
framework for use with practical constellations like QPSK and 16-QAM. The results obtained 
confirm our intuition and clearly demonstrate the benefit of this hybrid (mix of training and 
quantized feedback) approach for upcoming TDD systems. 

In previous work, Caire et al. studied the achievable rates for multi-user MIMO DL removing 
all the assumptions of channel state information at the receiver (CSIR) and CSIT for FDD systems 
in [12]. They gave transmission schemes incorporating all the necessary training and feedback 
stages and compared achievable rates either with analog feedback or with quantized feedback. 
The reference [13] studies the decay rate of the feedback distortion versus SNR with analog and 
digital quantized feedback for FDD systems. A very recent paper [14] studies combining the 
analog and digital feedback for FDD systems. All of these works fundamentally differ from our 
work as there is no channel reciprocity in FDD systems and hence there is no point in combining 
the UL training and the quantized feedback of the DL channel. 

Some other contributions [6], [15], [16], [17] and [18] analyze the sum rate of TDD systems 
starting without any assumption of CSI but restrict the CSIT acquisition to training only. Reference [7] 
compares TDD and FDD systems in terms of CSIT acquisition accuracy. 
Reference [19] studies the diversity-multiplexing trade-off [20] of two-way SIMO channels when TDD is 
the mode of operation. All of these references treat no-CSI TDD systems, but all acquire CSIT 
through training only. To the best of the authors' knowledge, no prior contribution 
exploits the combination of training and quantized feedback in TDD systems, which we believe 
to be one of the major novelties of this work. 

The paper is structured as follows: The system model is given in Section II, followed by 
the classical CSIT acquisition for FDD and TDD systems in Section III. The optimal CSIT 
acquisition strategy combining training and feedback is outlined in Section IV. Two iterative 
algorithms and one non-iterative algorithm for the joint estimation and detection are proposed in 
Section V. The simplified outage-rate based framework to optimize the resource split appears in 
Section VI, followed by its adaptation to practical constellations in Section VII. The simulation 
results are provided in Section VIII, followed by the conclusions and possible future 
extensions in Section IX. 

Notation: $\mathbb{E}$ denotes statistical expectation. Lowercase letters represent scalars, boldface 
lowercase letters represent vectors, and boldface uppercase letters denote matrices. $\mathbf{A}^{\dagger}$ and $\mathbf{A}^{-1}$ denote 
the Hermitian (conjugate transpose) and the inverse of matrix $\mathbf{A}$, respectively. For a vector $\mathbf{a}$, $\|\mathbf{a}\|$ and $\bar{\mathbf{a}}$ represent, 
respectively, its norm and unit-norm direction vector so that $\mathbf{a} = \|\mathbf{a}\|\bar{\mathbf{a}}$. A Gaussian distributed 
vector $\mathbf{a}$ with mean $\mathbf{m}_a$ and covariance matrix $\mathbf{K}_a$ is represented as $\mathbf{a} \sim \mathcal{CN}(\mathbf{m}_a, \mathbf{K}_a)$. $\mathbf{I}_M$ 
represents the identity matrix of $M$ dimensions. 



II. System Model and CSIR Acquisition 

We consider two-way communication in a cell between a single BS, equipped with $M$ 
antennas, and a single-antenna mobile user. The DL channel $\mathbf{h} \in \mathbb{C}^{M}$ is assumed to be flat-fading 
with independent complex Gaussian zero-mean unit-variance entries, where $\mathbb{C}^{M}$ represents the 
$M$-dimensional complex space. We assume a block-fading channel, so each channel realization 
stays constant for $T$ channel uses [21], which can be partitioned accordingly between UL and 
DL data transmissions. 

The goal of this work is to provide a reliable estimate of the DL channel to the BS, which 
in turn can be used for beamforming/precoding purposes. However, we focus on the acquisition 
of the channel knowledge and not on its use in MIMO transmission schemes. 

In the downlink, the received signal at the user for $L$ symbol intervals is given by 

$$\mathbf{y}_{dl} = \mathbf{X}_{dl}\mathbf{h} + \mathbf{n}_{dl}, \tag{1}$$

where $\mathbf{X}_{dl} \in \mathbb{C}^{L \times M}$ is the signal transmitted by the BS for $L$ channel uses (satisfying the BS power 
constraint), $\mathbf{n}_{dl} \in \mathbb{C}^{L}$ is the complex Gaussian noise with independent zero-mean unit-variance 
entries and $\mathbf{y}_{dl} \in \mathbb{C}^{L}$ is the observation sequence during this $L$-length interval. 

If we want to use the above DL system equation for channel estimation, then for identifiability 
of the $M$-dimensional channel at the user's side, the length of the transmitted data (the training 
sequence in this case) should be larger than $M$, the number of BS transmit antennas. Based upon 
the knowledge of the transmitted data $\mathbf{X}_{dl}$ (the training sequence) and the observed sequence 
$\mathbf{y}_{dl}$, the user can estimate the DL channel $\mathbf{h}$ using various techniques. The LS estimate, denoted 
as $\hat{\mathbf{h}}_{LS}$, would be [22] 

$$\hat{\mathbf{h}}_{LS} = (\mathbf{X}_{dl}^{\dagger}\mathbf{X}_{dl})^{-1}\mathbf{X}_{dl}^{\dagger}\mathbf{y}_{dl}. \tag{2}$$

The user can make a better channel estimate using the MMSE criterion, and the estimate is given by 

$$\hat{\mathbf{h}}_{MMSE} = (\mathbf{X}_{dl}^{\dagger}\mathbf{X}_{dl} + \mathbf{I}_M)^{-1}\mathbf{X}_{dl}^{\dagger}\mathbf{y}_{dl}. \tag{3}$$

III. Classical CSIT Acquisition in FDD and TDD 

We now briefly review the classical approaches for acquiring CSIT at the BS in FDD and 
TDD systems. We shall build upon the equations below in order to present our ideas later. 

A. FDD Systems 

A typical UL frame for FDD systems is shown in Fig. 1, where the initial $T_{fb}$ channel uses 
are reserved for feedback. 



[Figure 1 placeholder. Block labels inside the uplink frame: Pure Training, Quantized Feedback, Data Transmission.] 

Fig. 1. Uplink frame structure: Total feedback length is divided between UL training and quantized feedback phases. 

For the BS to be able to decode the feedback properly (sent as UL payload), it should first 
know/estimate the UL channel (denoted as $\mathbf{h}_u \in \mathbb{C}^{M}$). If the user sends a normalized training 
sequence $\mathbf{x}_a \in \mathbb{C}^{1 \times T_a}$ of length $T_a$ in the UL direction, the signal received at the BS for $T_a$ 
channel uses is given by 

$$\mathbf{Y}_a = \sqrt{P}\,\mathbf{h}_u\mathbf{x}_a + \mathbf{N}_a, \tag{4}$$

where $\mathbf{N}_a \in \mathbb{C}^{M \times T_a}$ represents the spatio-temporally white Gaussian noise with zero-mean 
unit-variance entries and $\mathbf{Y}_a \in \mathbb{C}^{M \times T_a}$ is the received signal at the $M$ antennas of the BS during this 
$T_a$-length training interval. $P$ represents the user's peak power constraint, which is equal to the 
UL signal-to-noise ratio (SNR) at every BS antenna due to the normalized noise variances. After 
observing $\mathbf{Y}_a$, the BS can make an estimate $\hat{\mathbf{h}}_u$ of the UL channel $\mathbf{h}_u$, knowing the training 
sequence $\mathbf{x}_a$. Estimation techniques like LS or MMSE, as described in the previous section, can 
be applied. 
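
As an illustration of eq. (4), the following Python sketch simulates the UL training interval and forms LS and MMSE estimates of the UL channel. It is a minimal sketch under stated assumptions: the pilot symbols have unit modulus, the channel prior is the i.i.d. unit-variance Gaussian model of the system model, and the function names (`cgauss`, `received_training`, `ls_estimate`, `mmse_estimate`) are hypothetical helpers introduced only for this example.

```python
import math
import random


def cgauss(rng):
    """Draw one CN(0, 1) sample (real and imaginary parts of variance 1/2 each)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def received_training(h_u, x_a, P, rng):
    """Eq. (4): Y_a = sqrt(P) h_u x_a + N_a, stored as M rows of T_a samples."""
    return [[math.sqrt(P) * hm * xt + cgauss(rng) for xt in x_a] for hm in h_u]


def ls_estimate(Y_a, x_a, P):
    """LS estimate Y_a x_a^H (x_a x_a^H)^{-1} / sqrt(P), computed per BS antenna."""
    energy = sum(abs(xt) ** 2 for xt in x_a)
    return [sum(y * xt.conjugate() for y, xt in zip(row, x_a)) / (energy * math.sqrt(P))
            for row in Y_a]


def mmse_estimate(Y_a, x_a, P):
    """MMSE estimate, assuming h_u ~ CN(0, I) and unit-variance noise."""
    energy = sum(abs(xt) ** 2 for xt in x_a)
    scale = math.sqrt(P) / (P * energy + 1.0)
    return [scale * sum(y * xt.conjugate() for y, xt in zip(row, x_a)) for row in Y_a]
```

Under these assumptions the MMSE estimate is the LS estimate shrunk by the factor $P\|\mathbf{x}_a\|^2 / (P\|\mathbf{x}_a\|^2 + 1)$, so the two coincide at high SNR or for long training sequences.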

In FDD systems, the mobile station obtains the DL channel estimate $\hat{\mathbf{h}}$ from the DL frame as 
described in the previous section. If $Q$ denotes the quantization function, then for the DL channel 
estimate $\hat{\mathbf{h}}$, its quantized version (the index of the closest codeword in the codebook) is given 
by $Q(\hat{\mathbf{h}})$. Afterward, the user maps this index (a sequence of bits) into a sequence of constellation 
symbols, using the mapping function denoted by $S$. Let the finite cardinality set of all mapped 
codewords be denoted by $\mathcal{CB}$. Hence the feedback of the DL channel would be 

$$\mathbf{x}_q = S(Q(\hat{\mathbf{h}})), \tag{5}$$

where $\mathbf{x}_q \in \mathbb{C}^{1 \times T_q}$ is the $T_q$-dimensional row vector of the normalized constellation symbols. 
The signal received at the BS upon transmission of $\mathbf{x}_q$ is 

$$\mathbf{Y}_q = \sqrt{P}\,\mathbf{h}_u\mathbf{x}_q + \mathbf{N}_q, \tag{6}$$

where $\mathbf{Y}_q$ and $\mathbf{N}_q$ are $M \times T_q$ matrices of the received signal and the noise, respectively, at the $M$ 
antennas of the BS during this explicit $T_q$-length feedback interval. So, based upon the estimate 
$\hat{\mathbf{h}}_u$ of the UL channel $\mathbf{h}_u$ and the received feedback $\mathbf{Y}_q$, the BS tries to recover the DL channel 
feedback (the quantized version, $\mathbf{x}_q$) using the optimum (although relatively complex) maximum 
likelihood (ML) sequence estimation technique: 

$$\hat{\mathbf{h}} = \arg\min_{\mathbf{h}} \left\| \mathbf{Y}_q - \sqrt{P}\,\hat{\mathbf{h}}_u\, S(Q(\mathbf{h})) \right\|^2. \tag{7}$$

The search space will be restricted to the codebook, hence the BS, at best, can estimate the 
quantized version of the channel. 
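
The chain of eqs. (5) to (7) can be sketched in Python as follows. This is only an illustrative sketch: the random unit-norm codebook, the choice of "closest" as largest correlation magnitude, and the QPSK bit-pair mapping used for $S$ are assumptions made for the example, and all function names are hypothetical.

```python
import math
import random


def cgauss(rng):
    """Draw one CN(0, 1) sample."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def random_codebook(M, B, rng):
    """2**B random unit-norm vectors in C^M (an assumed, RVQ-style codebook)."""
    book = []
    for _ in range(2 ** B):
        v = [cgauss(rng) for _ in range(M)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in v))
        book.append([c / norm for c in v])
    return book


def quantize(h, book):
    """Q(h): index of the codeword with the largest |c^H h| (smallest angle to h)."""
    def corr(c):
        return abs(sum(ci.conjugate() * hi for ci, hi in zip(c, h)))
    return max(range(len(book)), key=lambda k: corr(book[k]))


# Assumed mapping: one unit-energy QPSK symbol per pair of index bits.
QPSK = [complex(1, 1), complex(-1, 1), complex(1, -1), complex(-1, -1)]


def map_to_symbols(index, B):
    """S(.): split the B-bit index into bit pairs and map each pair to QPSK."""
    return [QPSK[(index >> shift) & 3] / math.sqrt(2) for shift in range(0, B, 2)]


def ml_detect(Y_q, h_hat_u, P, B):
    """Eq. (7): index minimising ||Y_q - sqrt(P) h_hat_u S(index)||^2."""
    def residual(index):
        x_q = map_to_symbols(index, B)
        return sum(abs(y - math.sqrt(P) * hm * xt) ** 2
                   for row, hm in zip(Y_q, h_hat_u) for y, xt in zip(row, x_q))
    return min(range(2 ** B), key=residual)
```

The exhaustive search in `ml_detect` visits all $2^B$ codewords, which reflects why the text calls ML detection relatively complex.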

B. TDD Systems 

If the communication system is operating in TDD mode, the DL and UL channels are reciprocal, 
hence $\mathbf{h}_u = \mathbf{h}$. So if a user transmits a pilot sequence on the UL (as in eq. (4)), simple 
(UL) channel estimation at the BS furnishes CSIT due to UL and DL channel reciprocity. In 
the past, this has been the classical way of getting CSIT in TDD systems [6], [7]. 

IV. Optimal Training and Feedback Combining in TDD Systems 

The classical training based CSIT acquisition for TDD systems ignores the fact that the user 
knows the DL channel, while CSIT acquisition based only on quantized feedback, as in FDD 
systems, cannot use the channel reciprocity. In TDD systems both can be exploited at 
the same time. 

We propose a novel hybrid two-stage CSIT acquisition strategy which exploits the channel 
reciprocity and the user's channel knowledge at the same time. We assume perfect channel knowledge 
at the user's side for ease of exposition¹ and later, in Section VIII-D, we show results 
removing this perfect CSIR assumption. Working under a constraint of fixed resource available 
for CSIT acquisition ($T_{fb}$ channel uses and the user's power constraint of $P$), our strategy consists 
of dividing this interval into two phases as shown in Fig. 1, contrary to the classical pilot 
sequence transmission. The first stage of this hybrid approach, termed "pure training", is the 
transmission of a training sequence from the user to the BS for $T_a$ channel uses, and the received 
signal will be 

$$\mathbf{Y}_a = \sqrt{P}\,\mathbf{h}\mathbf{x}_a + \mathbf{N}_a. \tag{8}$$

(See eq. (4) for the dimensions of all parameters.) 

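The optimal training based estimate, denoted as $\hat{\mathbf{h}}_a$, based upon the observed signal $\mathbf{Y}_a$ and 
knowing $\mathbf{x}_a$ will be 

$$\hat{\mathbf{h}}_a = \arg\min_{\mathbf{h}} \left\| \mathbf{Y}_a - \sqrt{P}\,\mathbf{h}\mathbf{x}_a \right\|^2. \tag{9}$$

The second stage, termed "quantized feedback", consists of the transmission of the quantized 
channel, already known at the user, for $T_q$ channel uses, and the received signal will be 

$$\mathbf{Y}_q = \sqrt{P}\,\mathbf{h}\mathbf{x}_q + \mathbf{N}_q, \tag{10}$$

(see eq. (6) for the dimensions of all parameters) where $\mathbf{x}_q = S(Q(\mathbf{h})) \in \mathcal{CB}$. This equation 
reveals the intriguing aspect that the BS needs to 
acquire $\mathbf{h}$, which appears both as the channel and in the transmitted feedback $\mathbf{x}_q$. The BS can try 
to decode only the quantized channel information based upon the knowledge of $\hat{\mathbf{h}}_a$ (obtained as 
in eq. (9) making use of pure training $\mathbf{x}_a$) 

$$\hat{\mathbf{h}}_q = \arg\min_{\mathbf{x}_q \in \mathcal{CB}} \left\| \mathbf{Y}_q - \sqrt{P}\,\hat{\mathbf{h}}_a\mathbf{x}_q \right\|^2, \tag{11}$$

where $\hat{\mathbf{h}}_q$ denotes the codebook vector associated with the minimizing $\mathbf{x}_q$. 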

The optimal CSIT will be obtained by the joint estimation and detection (of $\mathbf{h}$ and $\mathbf{x}_q$ 
respectively) based upon the observation of $\mathbf{Y}_a$ and $\mathbf{Y}_q$, knowing $\mathbf{x}_a$ and assuming an optimal 
split between the training and the quantized feedback phases (constrained as $T_a + T_q = T_{fb}$). 

$$\hat{\mathbf{h}} = \arg\min_{\mathbf{h}} \left\| [\mathbf{Y}_a\ \mathbf{Y}_q] - \sqrt{P}\,\mathbf{h}\,[\mathbf{x}_a\ S(Q(\mathbf{h}))] \right\|^2 \tag{12}$$

The optimal solution requires a double minimization and does not seem to admit a closed-form 
expression for $\hat{\mathbf{h}}$. 

¹ In general, the CSIR quality at the user's side is much better. Firstly, the DL pilots are global (they are not transmitted per 
user, contrary to the UL pilots) and secondly, the BS can typically transmit at higher power than small hand-held mobile 
devices. 

V. Algorithms for Joint Channel Estimation and Feedback Detection 

We give three algorithms in this section for the joint minimization of eq. (12). The first two 
algorithms are iterative: they solve the estimation and detection problems separately and iterate until convergence. These 
algorithms are closely inspired by [23], which proposes similar algorithms for joint blind 
estimation and detection for signal separation. We have made modifications for our requirements, 
where data-aided channel estimation after the initialization step and the presence of the channel as 
"data" (feedback) make the problem distinctive. The third algorithm is the single-shot solution 
of the joint estimation and detection. Owing to its simplicity, it allows us to further optimize 
the resource split between training and quantized feedback in the next section. 

A. Iterative Estimation and Detection 

We describe below our algorithm. 
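Step 1) Initial channel estimation based only upon pilots 

$$\hat{\mathbf{h}}_a^{0} = \arg\min_{\mathbf{h}} \left\| \mathbf{Y}_a - \sqrt{P}\,\mathbf{h}\mathbf{x}_a \right\|^2, \tag{13}$$

which is a simple least squares problem with the solution 

$$\hat{\mathbf{h}}_a^{0} = \mathbf{Y}_a\mathbf{x}_a^{\dagger}(\mathbf{x}_a\mathbf{x}_a^{\dagger})^{-1}\frac{1}{\sqrt{P}}. \tag{14}$$

$$i = 1 \tag{15}$$

The superscript denotes the iteration number. 

Step 2) At iteration $i$, enumerate all the codes in the codebook assuming that the 
channel $\hat{\mathbf{h}}_a^{i-1}$ is perfectly known. 

$$\hat{\mathbf{x}}_q^{i} = \arg\min_{\mathbf{x}_q \in \mathcal{CB}} \left\| \mathbf{Y}_q - \sqrt{P}\,\hat{\mathbf{h}}_a^{i-1}\mathbf{x}_q \right\|^2 \tag{16}$$

Step 3) Regenerate the extended pilot sequence $\mathbf{x}_{ext}$ (pilots and detected feedback) 

$$\mathbf{x}_{ext}^{i} = [\mathbf{x}_a\ \hat{\mathbf{x}}_q^{i}], \qquad \mathbf{Y}_{ext} = [\mathbf{Y}_a\ \mathbf{Y}_q]. \tag{17}$$

Step 4) Channel estimation based upon extended pilots (i.e. knowing $\mathbf{x}_{ext}^{i}$) 

$$\hat{\mathbf{h}}_a^{i} = \arg\min_{\mathbf{h}} \left\| \mathbf{Y}_{ext} - \sqrt{P}\,\mathbf{h}\mathbf{x}_{ext}^{i} \right\|^2 \tag{18}$$

$$\hat{\mathbf{h}}_a^{i} = \mathbf{Y}_{ext}(\mathbf{x}_{ext}^{i})^{\dagger}\left(\mathbf{x}_{ext}^{i}(\mathbf{x}_{ext}^{i})^{\dagger}\right)^{-1}\frac{1}{\sqrt{P}} \tag{19}$$

Step 5) If $\hat{\mathbf{x}}_q^{i} \neq \hat{\mathbf{x}}_q^{i-1}$ or $\hat{\mathbf{h}}_a^{i} \neq \hat{\mathbf{h}}_a^{i-1}$, set $i = i + 1$ and go to Step 2. 

The final channel estimate $\hat{\mathbf{h}}$ is the channel vector corresponding to $\hat{\mathbf{x}}_q$ in the codebook. 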
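
The five steps above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the codebook is represented directly as a list of candidate symbol sequences (here, as an assumption, all QPSK sequences of length $T_q$), and the helper names `ls_channel`, `residual`, `qpsk_codebook` and `iterative_estimation_detection` are hypothetical. The returned `history` records the residual after each iteration.

```python
import itertools
import math


def ls_channel(Y, x, P):
    """LS fit of h in Y = sqrt(P) h x + N: Y x^H (x x^H)^{-1} / sqrt(P)."""
    energy = sum(abs(xt) ** 2 for xt in x)
    return [sum(y * xt.conjugate() for y, xt in zip(row, x)) / (energy * math.sqrt(P))
            for row in Y]


def residual(Y, h, x, P):
    """Squared Frobenius norm ||Y - sqrt(P) h x||^2."""
    return sum(abs(y - math.sqrt(P) * hm * xt) ** 2
               for row, hm in zip(Y, h) for y, xt in zip(row, x))


def qpsk_codebook(T_q):
    """Assumed stand-in for the mapped codebook: all QPSK sequences of length T_q."""
    points = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
    return [list(seq) for seq in itertools.product(points, repeat=T_q)]


def iterative_estimation_detection(Y_a, Y_q, x_a, codebook, P, max_iter=50):
    """Alternate codebook enumeration (Step 2) and LS re-estimation (Step 4)."""
    h = ls_channel(Y_a, x_a, P)                      # Step 1: pilots only
    Y_ext = [ra + rq for ra, rq in zip(Y_a, Y_q)]    # Y_ext = [Y_a Y_q]
    k_prev, history = None, []
    for _ in range(max_iter):
        # Step 2: exhaustive detection with the current channel estimate
        k = min(range(len(codebook)), key=lambda j: residual(Y_q, h, codebook[j], P))
        x_ext = list(x_a) + list(codebook[k])        # Step 3: extended pilots
        h = ls_channel(Y_ext, x_ext, P)              # Step 4: re-estimate channel
        history.append(residual(Y_ext, h, x_ext, P))
        if k == k_prev:                              # Step 5: same codeword, so same h
            break
        k_prev = k
    return k, h, history
```

Because the channel estimate in Step 4 is a deterministic function of the detected codeword, checking that the codeword index has stopped changing is enough to detect convergence in this sketch.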



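Theorem 1 (Convergence for Iterative Estimation and Detection Algorithm): Let $\hat{\mathbf{h}}_a^{i}$ be the 
estimated channel and $\hat{\mathbf{x}}_q^{i}$ be the detected feedback, both at the $i$-th iteration of the iterative estimation 
and detection algorithm. Let the residual function $f(\hat{\mathbf{h}}_a^{i}, \mathbf{x}_{ext}^{i}; \mathbf{Y}_{ext}) = \|\mathbf{Y}_{ext} - \sqrt{P}\,\hat{\mathbf{h}}_a^{i}\mathbf{x}_{ext}^{i}\|^2$ 
be selected as the descent function for this algorithm. Then there exists some $j$ such that for 
any $i > j$, $\hat{\mathbf{x}}_q^{i} = \hat{\mathbf{x}}_q^{j}$ and $\hat{\mathbf{h}}_a^{i} = \hat{\mathbf{h}}_a^{j}$. 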

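Proof: The residual descent function $f(\hat{\mathbf{h}}_a^{i}, \mathbf{x}_{ext}^{i}; \mathbf{Y}_{ext}) = \|\mathbf{Y}_{ext} - \sqrt{P}\,\hat{\mathbf{h}}_a^{i}\mathbf{x}_{ext}^{i}\|^2$ is clearly 
non-negative and continuous. Considering the residual function at the $i$-th iteration: 

$$\begin{aligned}
f(\hat{\mathbf{h}}_a^{i}, \mathbf{x}_{ext}^{i}; \mathbf{Y}_{ext}) &\overset{a}{=} \|\mathbf{Y}_{ext} - \sqrt{P}\,\hat{\mathbf{h}}_a^{i}\mathbf{x}_{ext}^{i}\|^2 \\
&\overset{b}{=} \min_{\mathbf{h}} \|\mathbf{Y}_{ext} - \sqrt{P}\,\mathbf{h}\mathbf{x}_{ext}^{i}\|^2 \\
&\overset{c}{\le} \|\mathbf{Y}_{ext} - \sqrt{P}\,\hat{\mathbf{h}}_a^{i-1}\mathbf{x}_{ext}^{i}\|^2 \\
&\overset{d}{=} \|\mathbf{Y}_a - \sqrt{P}\,\hat{\mathbf{h}}_a^{i-1}\mathbf{x}_a\|^2 + \|\mathbf{Y}_q - \sqrt{P}\,\hat{\mathbf{h}}_a^{i-1}\hat{\mathbf{x}}_q^{i}\|^2 \\
&\overset{e}{=} \|\mathbf{Y}_a - \sqrt{P}\,\hat{\mathbf{h}}_a^{i-1}\mathbf{x}_a\|^2 + \min_{\mathbf{x}_q \in \mathcal{CB}} \|\mathbf{Y}_q - \sqrt{P}\,\hat{\mathbf{h}}_a^{i-1}\mathbf{x}_q\|^2 \\
&\overset{f}{\le} \|\mathbf{Y}_a - \sqrt{P}\,\hat{\mathbf{h}}_a^{i-1}\mathbf{x}_a\|^2 + \|\mathbf{Y}_q - \sqrt{P}\,\hat{\mathbf{h}}_a^{i-1}\hat{\mathbf{x}}_q^{i-1}\|^2 \\
&\overset{g}{=} \|\mathbf{Y}_{ext} - \sqrt{P}\,\hat{\mathbf{h}}_a^{i-1}\mathbf{x}_{ext}^{i-1}\|^2 \\
&= f(\hat{\mathbf{h}}_a^{i-1}, \mathbf{x}_{ext}^{i-1}; \mathbf{Y}_{ext}).
\end{aligned} \tag{20}$$

Equalities d and g make use of the property of the Frobenius norm [24]. Inequality c holds because 
$\hat{\mathbf{h}}_a^{i}$ minimizes the residual for the extended pilots $\mathbf{x}_{ext}^{i}$, and inequality f holds because $\hat{\mathbf{x}}_q^{i}$ minimizes 
the feedback residual over the codebook. The set of equations 
above shows that each single iteration of the algorithm over estimation and detection 
monotonically reduces the residual function unless the iterates have converged. This monotonic reduction 
of the descent function, its non-negativity and the fact that $\hat{\mathbf{x}}_q$ belongs to a finite set (the codes 
of the codebook), and hence the corresponding iterates of the estimation subproblem are also finite in number, 
prove the convergence of this algorithm to a locally optimal solution in a finite number of 
steps. The globally optimal solution is achieved by having a good initial point, which depends 
upon the training part, as confirmed by our simulations. ■ 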

B. Simplified Iterative Estimation and Detection 

This algorithm is very similar to the previous algorithm in essence but the difference arises at 
the detection step. The second step of the previous algorithm, the ML detection of the quantized 
code from the codebook, is computationally quite onerous, especially for codebooks with large 
cardinality. So we replace this enumeration step with least squares detection followed by mapping 
on the codebook. So the Step 2 of the previous algorithm gets replaced by two sub-steps. 
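Step 2-A) At iteration $i$, do LS detection of the quantized feedback assuming $\hat{\mathbf{h}}_a^{i-1}$ as the 
perfectly known channel 

$$\mathbf{x}_{LS}^{i} = \left((\hat{\mathbf{h}}_a^{i-1})^{\dagger}\hat{\mathbf{h}}_a^{i-1}\right)^{-1}(\hat{\mathbf{h}}_a^{i-1})^{\dagger}\,\mathbf{Y}_q\frac{1}{\sqrt{P}} \tag{21}$$

Step 2-B) Do hard detection on the constellation symbols, which maps the LS 
estimate to the nearest code in the codebook. 

$$\hat{\mathbf{x}}_q^{i} = \mathrm{HardDetection}(\mathbf{x}_{LS}^{i}) \tag{22}$$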

This helps to significantly reduce the computational complexity. Later results show that this 
does not involve any discernible performance degradation. 
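
Steps 2-A and 2-B can be sketched in Python as below. The sketch assumes QPSK feedback symbols and assumes that every QPSK sequence of length $T_q$ is a valid mapped codeword, so that symbol-wise slicing yields a codeword; both are assumptions for the example, and `ls_detect` and `hard_detection` are hypothetical names.

```python
import math

# Assumed constellation: unit-energy QPSK.
QPSK = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]


def ls_detect(Y_q, h_hat, P):
    """Eq. (21): x_LS = (h^H h)^{-1} h^H Y_q / sqrt(P), one soft value per symbol."""
    gain = sum(abs(hm) ** 2 for hm in h_hat) * math.sqrt(P)
    T_q = len(Y_q[0])
    return [sum(hm.conjugate() * row[t] for hm, row in zip(h_hat, Y_q)) / gain
            for t in range(T_q)]


def hard_detection(x_ls, constellation=QPSK):
    """Eq. (22): snap every soft value to the nearest constellation point."""
    return [min(constellation, key=lambda s: abs(s - v)) for v in x_ls]
```

The cost of this detector grows linearly with $T_q$, whereas the enumeration of Step 2 in the previous algorithm grows with the codebook size $2^B$.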

C. Single-Shot Estimation and Detection 

This is the simplest and the fastest algorithm for the joint estimation and detection problem 
where the channel estimation and the feedback detection are performed (separately) only once. 
Step 1) Channel estimation based only upon the pilots 

$$\hat{\mathbf{h}}_a = \arg\min_{\mathbf{h}} \left\| \mathbf{Y}_a - \sqrt{P}\,\mathbf{h}\mathbf{x}_a \right\|^2. \tag{23}$$

We can employ either the LS or the MMSE estimation technique. 

Step 2) Detection of the feedback $\mathbf{x}_q$ assuming the channel $\hat{\mathbf{h}}_a$ is perfectly known. This detection 
problem can be solved either by enumerating all the codewords as in the first algorithm, by 
simple LS as in the second algorithm, or by applying an MMSE filter. 

VI. Outage Based Training and Feedback Partitioning 
A. Definitions and Initial Setup 

The solution for the optimal CSIT estimate, $\hat{\mathbf{h}}$ in eq. (12), requires joint estimation and 
detection. Furthermore, the fixed resource ($T_{fb}$ channel uses) needs to be optimally split between 
training and feedback. Even if, as a simplification, we focus separately on the training based estimate 
$\hat{\mathbf{h}}_a$ (given in eq. (9)) and the digital feedback based estimate $\hat{\mathbf{h}}_q$ (given in eq. (11)), two questions 
arise: i) how should the fixed CSIT acquisition interval $T_{fb}$ be split between training and 
feedback, and ii) how should the two estimates be combined to get the final estimate? 

We use the minimization of the mean-square error (MSE) of the final CSIT (defined below) as 
the criterion for the optimal resource split, thus answering the first question, for which we give the 
proper framework in the next subsection. It has been shown in [9] that the principal factor in the 
DL sum rate loss due to imperfect CSI is the MSE of CSIT. Hence the minimization of the MSE 
of CSIT is equivalent to the maximization of the system-wide sum rate, the most commonly 
adopted performance metric. Furthermore, as an answer to the second question, we propose to use the 
quantized feedback based estimate $\hat{\mathbf{h}}_q$ as the final CSIT estimate $\hat{\mathbf{h}}$, owing to the better channel 
diversity exploitation properties of digital transmission. It may give the impression that the 
training based estimate $\hat{\mathbf{h}}_a$ is wasted, but in reality the quantized feedback $\mathbf{x}_q$, which provides $\hat{\mathbf{h}}_q$, 
is decoded based upon this training based estimate $\hat{\mathbf{h}}_a$. 

This optimization framework consists of first providing a training based estimate $\hat{\mathbf{h}}_a$ to the BS 
in the training interval of $T_a$ channel uses. In the second interval of $T_q$ channel uses, the user 
sends the quantized version of its unit-norm channel direction information (CDI) vector, which 
we assume to be perfectly known at the user. As the channel stays constant for each acquisition 
interval, this feedback transmission is equivalent to the transmission over slow fading channels, 
for which deep channel fades (causing outage) are the typical error events [25]. We define 
"outage" as the event in which the channel realization and the quality of the training based estimate 
$\hat{\mathbf{h}}_a$ (a function of $T_a$) do not allow the BS to successfully decode the feedback information. Let 
$\epsilon(T_a, b)$ be the outage probability when transmitting $b$ bits per channel use on the UL feedback 
channel. Thus $b$ is the $\epsilon(T_a, b)$-outage rate of the UL channel [25]. So the user can send a total 
of $B = bT_q$ feedback bits at $\epsilon(T_a, b)$ outage. Although the constellations used in practice have $2^b$ 
points where $b$ must be a positive integer, for the time being we relax this restriction and allow 
positive real values for $b$. 

We define the squared CDI error as the sine squared of the angle ($\theta$) between the true channel 
direction vector $\bar{\mathbf{h}}$ and the BS estimated direction vector $\hat{\bar{\mathbf{h}}}$, denoted as $\sigma^2(\bar{\mathbf{h}}, \hat{\bar{\mathbf{h}}})$. 

$$\sigma^2(\bar{\mathbf{h}}, \hat{\bar{\mathbf{h}}}) = \sin^2(\theta) = 1 - \cos^2(\theta) = 1 - |\bar{\mathbf{h}}^{\dagger}\hat{\bar{\mathbf{h}}}|^2 \tag{24}$$

Further, the MSE of CSIT is defined to be the expected value of the squared CDI error at the 
transmitter and is denoted as $\sigma^2$. This is a slight abuse of notation, but it has been shown 
that the CDI plays a vital role both for single-user and multi-user scenarios [9]. 

For the quantization of the $M$-dimensional unit-norm CDI at the user, we employ random vector 
quantization (RVQ). For RVQ, the exact expression for the mean-square quantization error $\sigma_q^2$ 
has been given in [26], [9] as 

$$\sigma_q^2 = 2^{B}\,\beta\!\left(2^{B}, \frac{M}{M-1}\right), \tag{25}$$

where $B$ is the total number of feedback bits (i.e. the codebook consists of $2^B$ codes) and $\beta$ 
represents the beta function, which is defined in terms of the Gamma function as $\beta(a, b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$. 
However, it turns out that a simple and tight upper bound given in reference [9] suffices: 

$$\sigma_q^2 \le 2^{-\frac{B}{M-1}}. \tag{26}$$
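
To see how close the bound of eq. (26) is to the exact expression of eq. (25), both can be evaluated numerically. The Python sketch below is illustrative only; the function names are hypothetical, and the use of the log-gamma function is an implementation choice made here to avoid overflow for large $2^B$.

```python
import math


def rvq_error_exact(B, M):
    """Eq. (25): 2^B * beta(2^B, M/(M-1)), evaluated through log-gamma."""
    a = 2.0 ** B
    c = M / (M - 1.0)
    log_beta = math.lgamma(a) + math.lgamma(c) - math.lgamma(a + c)
    return math.exp(B * math.log(2.0) + log_beta)


def rvq_error_bound(B, M):
    """Eq. (26): 2^(-B/(M-1))."""
    return 2.0 ** (-B / (M - 1.0))


if __name__ == "__main__":
    M = 4
    for B in (0, 4, 8, 12):
        print(B, rvq_error_exact(B, M), rvq_error_bound(B, M))
```

For $B = 0$ the exact expression reduces to $(M-1)/M$, the expected squared sine of the angle between the channel direction and a single random unit vector.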

B. Optimal Resource Split between Training and Quantized Feedback 

Theorem 2 (The minimization of the MSE of CSIT): Under the training and feedback combining 
strategy, the MSE of CSIT $\sigma^2$ is minimized as a result of the following optimization 
governing the split of the fixed resource ($T_{fb}$) between the training interval $T_a$ and the quantized feedback 
interval $T_q$, and the outage rate $b$: 

$$\sigma^{2*} = \min_{T_a,\,b} \left[ 2^{-\frac{b(T_{fb}-T_a)}{M-1}} + \epsilon(T_a, b) \right]. \tag{27}$$

The constraints for this minimization are: 

$$1 \le T_a \le T_{fb} \quad \text{and} \quad 0 \le b. \tag{28}$$

The outage probability in the feedback interval $\epsilon(T_a, b)$ and the outage rate $b$ are linked by the 
relation: 

$$b = \log\left(1 + \frac{P^2 T_a}{2(P + PT_a + 1)}\,F^{-1}(\epsilon(T_a, b))\right), \tag{29}$$

where $P$ is the user's power constraint and $F^{-1}(\cdot)$ is the inverse of the standard cumulative 
distribution function (CDF) of a $\chi^2_{2M}$ distributed variable. 

Proof: The proof consists of two parts. First we show the argument of the minimization to be 
an upper bound on the MSE of CSIT and, in the second part, the relation between $\epsilon(T_a, b)$ and 
$b$ is derived. 

Upper bound on the MSE of CSIT: During the feedback phase, when the channel is not in 
outage and the BS is able to decode the feedback correctly, there is only quantization error in the 
final CSIT estimate. On the other hand, when the channel is in outage (which happens with probability 
$\epsilon(T_a, b)$), the BS cannot decode the feedback information. Hence the MSE of CSIT $\sigma^2$ can be 
written as 

$$\begin{aligned}
\sigma^2 &= (1 - \epsilon(T_a, b))\,\sigma_q^2 + \epsilon(T_a, b)\,\mathbb{E}\,\sigma_{\epsilon}^2(\bar{\mathbf{h}}, \hat{\bar{\mathbf{h}}}) \\
&\le (1 - \epsilon(T_a, b))\,\sigma_q^2 + \epsilon(T_a, b) \\
&\le \sigma_q^2 + \epsilon(T_a, b),
\end{aligned} \tag{30}$$

where $\sigma_q^2$ is the mean-square quantization error and $\mathbb{E}\,\sigma_{\epsilon}^2(\bar{\mathbf{h}}, \hat{\bar{\mathbf{h}}})$ represents the MSE of CSIT 
when the channel is in outage (which means a feedback error occurs). The first inequality is 
obtained as $\mathbb{E}\,\sigma_{\epsilon}^2(\bar{\mathbf{h}}, \hat{\bar{\mathbf{h}}})$ is upper-bounded by 1. Putting the value of $\sigma_q^2$ from eq. (26) using 
$B = bT_q$ and $T_{fb} = T_a + T_q$ in eq. (30), we get the desired upper bound of the MSE of CSIT as 

$$\sigma^2 \le 2^{-\frac{b(T_{fb}-T_a)}{M-1}} + \epsilon(T_a, b), \tag{31}$$

which concludes the first part of our proof. 

Significance of the MSE bound: The MSE bound of the CSIT in eq. (31) is the desired 
performance metric. Its minimization gives us the optimal values for $T_a$, $T_q$ and $b$ (the number 
of feedback bits per channel use; this parameter governs the constellation size and hence the 
quantization error) for a fixed resource $T_{fb}$. This bound shows us the basic trade-off involved. 
If the total number of feedback bits $B = bT_q$ is made large (either by choosing a large rate $b$ 
per channel use in the feedback channel or by making $T_q$ large), it will allow the user to select 
a larger codebook (with $2^B$ codewords) and hence the quantization error will be negligible. But 
this strategy will degrade the final CSIT estimate by introducing a lot of outage (due to 
large $b$ or a poor channel estimate $\hat{\mathbf{h}}_a$ caused by small $T_a = T_{fb} - T_q$). On the other hand, for a 
small number of total feedback bits $B$, the degradation due to outage probability will fade away, 
but there will be fewer codewords in the codebook and hence a large quantization error. 

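The relation of $b$ and $\epsilon(T_a, b)$: Pilot sequence transmission from the user to the BS for an 
interval of length $T_a$, given in eq. (8), can be equivalently written in a simplified form as 

$$\mathbf{y}_a = \sqrt{PT_a}\,\mathbf{h} + \mathbf{n}_a, \tag{32}$$

where $P$ is the user's power constraint and $\mathbf{y}_a$, $\mathbf{h}$, $\mathbf{n}_a$ are the received signal, the channel vector 
and the noise respectively, all column vectors of dimension $M$. The BS can make the MMSE estimate 
$\hat{\mathbf{h}}_a$ of the channel $\mathbf{h}$ as 

$$\hat{\mathbf{h}}_a = \frac{\sqrt{PT_a}}{PT_a + 1}\,\mathbf{y}_a. \tag{33}$$

As the i.i.d. channel entries are standard Gaussian, the MMSE estimation error $\tilde{\mathbf{h}}_a = \mathbf{h} - \hat{\mathbf{h}}_a$ 
also has Gaussian i.i.d. entries, $\tilde{\mathbf{h}}_a \sim \mathcal{CN}(\mathbf{0}, \sigma_a^2\mathbf{I}_M)$, and the MSE per channel coefficient $\sigma_a^2$ 
is given by 

$$\sigma_a^2 = \frac{1}{PT_a + 1}. \tag{34}$$

Similarly, the estimate $\hat{\mathbf{h}}_a$ has Gaussian i.i.d. entries and is distributed as $\hat{\mathbf{h}}_a \sim \mathcal{CN}\left(\mathbf{0}, \frac{PT_a}{PT_a+1}\mathbf{I}_M\right)$. 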
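
The error variance of eq. (34) can be checked with a short Monte Carlo experiment on the simplified model of eq. (32). The Python sketch below simulates a single channel coefficient, which suffices because the entries are i.i.d.; the function names and the trial count are assumptions made for this illustration.

```python
import math
import random


def mmse_error_variance(P, T_a):
    """Eq. (34): per-coefficient MSE 1 / (P * T_a + 1)."""
    return 1.0 / (P * T_a + 1.0)


def simulate_mmse_error(P, T_a, trials, rng):
    """Empirical MSE of the estimate of eq. (33) for one channel coefficient."""
    s = math.sqrt(0.5)
    g = math.sqrt(P * T_a)
    total = 0.0
    for _ in range(trials):
        h = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))   # channel ~ CN(0, 1)
        n = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))   # noise ~ CN(0, 1)
        y = g * h + n                                        # eq. (32)
        h_hat = g / (P * T_a + 1.0) * y                      # eq. (33)
        total += abs(h - h_hat) ** 2
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    print(mmse_error_variance(1.0, 3), simulate_mmse_error(1.0, 3, 20000, rng))
```
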
Now we focus our attention on the quantized feedback interval of the CSIT acquisition, given
in eq. (10). The signal received during one symbol interval of this phase is given by

$$ y_q = \sqrt{P}\, h\, x_q + n_q, \qquad (35) $$

where $x_q$ represents the scalar feedback symbol transmitted by the user and $y_q$, $h$, $n_q$ are $M$-dimensional
column vectors representing respectively the observed signal, the channel and the
noise for this particular symbol interval. To decode this information, the BS uses the estimate
$\hat{h}_a$ that it developed during the training phase. So the above equation can be written as

$$ y_q = \sqrt{P}\, \hat{h}_a x_q + \sqrt{P}\, \tilde{h}_a x_q + n_q. \qquad (36) $$

The average effective signal-to-noise ratio (denoted as $\mathrm{SNR}_{\mathrm{eff}}$) at the BS during the feedback
interval, obtained by relegating the channel estimation error portion of the signal to noise and treating
$\hat{h}_a$ as the perfectly known channel, is given by

$$ \mathrm{SNR}_{\mathrm{eff}} = \frac{P\, \|\hat{h}_a\|^2}{P \sigma_a^2 + 1}. \qquad (37) $$

Plugging in the value of $\sigma_a^2$ from eq. (34), $\mathrm{SNR}_{\mathrm{eff}}$ becomes

$$ \mathrm{SNR}_{\mathrm{eff}} = \frac{P\, \|\hat{h}_a\|^2}{\frac{P}{P T_a + 1} + 1}. \qquad (38) $$

We can make a small change of variable, as $\frac{2(P T_a + 1)}{P T_a} \|\hat{h}_a\|^2$ is a standard chi-square random
variable having $2M$ degrees of freedom (DOF), denoted as $\chi^2_{2M}$. So $\mathrm{SNR}_{\mathrm{eff}}$ becomes

$$ \mathrm{SNR}_{\mathrm{eff}} = \frac{P^2 T_a}{2 (P + P T_a + 1)}\, \chi^2_{2M}. \qquad (39) $$
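The step from eq. (38) to eq. (39) can be spelled out as follows. This is a sketch that uses only the stated distribution of $\hat{h}_a$, whose entries have variance $P T_a / (P T_a + 1)$:

```latex
\begin{align*}
\frac{P}{P T_a + 1} + 1 &= \frac{P + P T_a + 1}{P T_a + 1}, \\
\|\hat{h}_a\|^2 &= \frac{P T_a}{2 (P T_a + 1)}\, \chi^2_{2M}, \\
\mathrm{SNR}_{\mathrm{eff}}
  &= P\, \|\hat{h}_a\|^2 \cdot \frac{P T_a + 1}{P + P T_a + 1}
   = \frac{P^2 T_a}{2 (P + P T_a + 1)}\, \chi^2_{2M}.
\end{align*}
```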
The outage probability $\epsilon(T_a, b)$ during this feedback interval, corresponding to the outage rate
of $b$ bits per channel use, can be written as

$$ \epsilon(T_a, b) = \mathbb{P}\left[ \log\left(1 + \mathrm{SNR}_{\mathrm{eff}}\right) < b \right] = \mathbb{P}\left[ \log\left(1 + \frac{P^2 T_a}{2 (P + P T_a + 1)}\, \chi^2_{2M}\right) < b \right], \qquad (40) $$

where $\mathbb{P}$ denotes the probability of an event. This relation can be inverted to obtain the outage
rate $b$ corresponding to the outage probability $\epsilon(T_a, b)$, as given below:

$$ b = \log\left(1 + \frac{P^2 T_a}{2 (P + P T_a + 1)}\, F^{-1}\big(\epsilon(T_a, b)\big)\right), \qquad (41) $$

where $F^{-1}(\cdot)$ is the inverse of the CDF of a $\chi^2_{2M}$ distributed variable. This concludes the proof.

The minimization in Theorem 2 does not admit a closed-form solution,
but its numerical optimization is straightforward. ■
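The pair of relations in eqs. (40) and (41) is easy to evaluate numerically. The sketch below uses the closed-form CDF of a chi-square variable with an even number of degrees of freedom and a bisection search for its inverse. It assumes that the logarithm is taken to base 2 (since $b$ is measured in bits); the function names are illustrative.

```python
import math


def chi2_cdf_even(x, M):
    """CDF of a chi-square variable with 2M degrees of freedom (closed form)."""
    if x <= 0.0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, M):
        term *= half / k          # (x/2)^k / k!
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_inv_even(p, M, tol=1e-12):
    """Inverse CDF by bisection, intended for 0 <= p < 1."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, M) < p:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, M) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def snr_scale(P, T_a):
    """Factor multiplying the chi-square variable in eq. (39)."""
    return P * P * T_a / (2.0 * (P + P * T_a + 1.0))


def outage_probability(P, T_a, b, M):
    """Eq. (40), assuming log base 2."""
    threshold = (2.0 ** b - 1.0) / snr_scale(P, T_a)
    return chi2_cdf_even(threshold, M)


def outage_rate(P, T_a, eps, M):
    """Eq. (41), assuming log base 2."""
    return math.log2(1.0 + snr_scale(P, T_a) * chi2_inv_even(eps, M))
```

Feeding the output of `outage_probability` back into `outage_rate` with the same `P`, `T_a` and `M` should recover the original `b`, which is a convenient sanity check on the inversion.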

VII. Optimization Setup With Practical Constellations 

In the previous optimization procedure, we relaxed the restriction to practical constellations
and allowed any positive real value for the outage rate $b$ in bits per channel use. This does not
hold in practical communication systems, where the number of constellation points is always
an integer power of 2, i.e., $b$ can only take integer values. We propose
two simple strategies in the following subsections to handle this
limitation of practical constellations.

A. Resource Split Optimization for a Fixed Constellation

We can optimize the MSE of CSIT for a fixed constellation, i.e., for a fixed outage rate
$b$. In this case, the outage rate based optimization setup built in the previous section remains
valid, except that $b$ is no longer an optimization variable but a fixed parameter corresponding
to the chosen constellation. Thus $b$ takes the values 2 and 4 for QPSK and 16-QAM,
respectively, although any other constellation can be chosen. Minimizing the MSE of
CSIT then gives the optimal resource split tailored to the chosen constellation. Hence
the objective function for a fixed constellation (fixed value of $b$) becomes:

-b(T fb -T a ) 



mm 



+ e(T a} b) 



(42) 



where $T_{fb} = T_a + T_q$ and $b$ are fixed, and $b$ and $\epsilon(T_a, b)$ are related as in Theorem 2. The
constraint for this minimization is

$$ 1 \le T_a < T_{fb}. \qquad (43) $$

This minimization gives the optimal training length $T_a$ that yields
the minimum MSE of CSIT for this particular constellation (fixed $b$) under fixed values of $M$,
$P$ and $T_{fb}$. The restriction to a fixed constellation brings some limitations. For example, the
use of a small constellation like QPSK at very high SNR is not beneficial, as the CSIT error
stays bounded due to the fixed cardinality of the codebook: the quantization error does not
diminish with SNR, even for asymptotically large values of SNR.
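Because $T_a$ only takes integer values between 1 and $T_{fb} - 1$, the fixed-constellation minimization can be carried out by exhaustive search. The sketch below illustrates this under stated assumptions: the outage term follows eq. (40) with a base-2 logarithm, while the quantization-error term is passed in as a function. The default `placeholder_quantization_error`, which decays as $2^{-B/(M-1)}$, is only a hypothetical stand-in and should be replaced by the quantization term of the bound used in the paper.

```python
import math


def chi2_cdf_even(x, M):
    """CDF of a chi-square variable with 2M degrees of freedom."""
    if x <= 0.0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, M):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def outage_probability(P, T_a, b, M):
    """Eq. (40), assuming log base 2."""
    scale = P * P * T_a / (2.0 * (P + P * T_a + 1.0))
    return chi2_cdf_even((2.0 ** b - 1.0) / scale, M)


def placeholder_quantization_error(B, M):
    """ASSUMPTION: hypothetical stand-in for the quantization term of the bound."""
    return 2.0 ** (-B / (M - 1))


def cost(P, b, T_a, T_fb, M, quant_error=placeholder_quantization_error):
    """Quantization term plus outage term for a given training length."""
    T_q = T_fb - T_a
    return quant_error(b * T_q, M) + outage_probability(P, T_a, b, M)


def best_training_length(P, b, T_fb, M, quant_error=placeholder_quantization_error):
    """Exhaustive search over integer T_a with 1 <= T_a < T_fb."""
    candidates = range(1, T_fb)
    return min(candidates, key=lambda T_a: cost(P, b, T_a, T_fb, M, quant_error))
```

For example, `best_training_length(100.0, 2, 20, 4)` returns the training length that minimizes the placeholder objective for QPSK at an UL SNR of 20 dB; the numerical value depends on the placeholder and is not a result from the paper.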



B. Using Real Values of b with Extra Parity Bits 

The other way to resolve the issue of discrete practical constellations is through the use
of channel coding. This allows us to use the positive real values of $b$ obtained from the original
optimization setup. The only restriction we impose is that $B$ should be an integer, which
can be obtained by applying the ceiling or floor operation to the product $bT_q$. This $B$ governs the
cardinality of the codebook. The constellation actually used to send the feedback is the one
larger than that dictated by $b$, among the available constellations. Let the rate of that constellation
be denoted by $b_c$. Hence the total number of bits sent in the feedback phase
is $B_c = b_c T_q$, where $B_c > B$ as $b_c > b$. All the extra $B_c - B$ bits in the feedback phase are
used as parity bits. One can employ either linear block codes or convolutional codes with
an appropriate rate to convert the $B$ information (true channel feedback) bits into $B_c$ coded
bits. One advantage of convolutional codes is that puncturing gives more flexibility
for rate matching. These $B_c$ bits are then sent in the digital feedback phase. As the outage rate
$b$ is less than the rate $b_c$ of the chosen constellation, the use of the larger constellation
increases the number of erroneous coded bits. The number of errors grows
in direct proportion to the difference $B_c - B$. On the other hand, all the extra feedback bits
$B_c - B$ are parity bits, and when decoding is performed at the BS, the capability of the
coding/decoding operation to combat the channel errors (introduced in the quantized feedback)
is also proportional to this difference, hence compensating for the negative impact of using the larger
constellation.
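The bookkeeping of this strategy can be made concrete with a short sketch. It assumes a hypothetical set of available constellation rates (2, 4 and 6 bits per channel use, the last one standing for 64-QAM, which is not used in the paper), rounds $bT_q$ with the floor operation by default, and picks the smallest available rate that is at least $b$; all of these choices are illustrative.

```python
import math


def coded_feedback_plan(b, T_q, available_rates=(2, 4, 6), rounding=math.floor):
    """Return (B, b_c, B_c, parity_bits, code_rate) for a real-valued outage rate b.

    B        : integer number of information (channel feedback) bits
    b_c      : rate of the constellation actually transmitted
    B_c      : total number of coded bits sent in the feedback phase
    """
    B = int(rounding(b * T_q))
    # ASSUMPTION: smallest available constellation rate that is at least b.
    candidates = [r for r in sorted(available_rates) if r >= b]
    if not candidates:
        raise ValueError("no available constellation supports rate b")
    b_c = candidates[0]
    B_c = b_c * T_q
    return B, b_c, B_c, B_c - B, B / B_c
```

With `b = 2.6` and `T_q = 12`, the plan uses 16-QAM ($b_c = 4$), carries $B = 31$ information bits in $B_c = 48$ coded bits, and therefore needs a code of rate 31/48 with 17 parity bits.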

VIII. Simulation Results 

Our simulation environment consists of a BS with $M = 4$ antennas and a single user with a
single antenna. The channel model is the same as described in Section II-B. The feedback interval
$T_{fb}$ is fixed to 20 channel uses for all simulations.

A. Optimization Results for Continuous Constellations 

First we present the results when the outage rate $b$ is not constrained to be an integer and
can assume any positive real value. The optimization of the objective function given in Section
VI yields the optimal training length $T_a$ and the optimal outage rate $b$ for
various values of the user's power constraint, which is equal to the UL SNR as the noise at every
BS antenna has been normalized to unit variance. Knowing the values of $\epsilon(T_a, b)$ and $T_q$,
computed from the optimal values of $T_a$ and $b$, allows us to compute the upper bound on the
final CSIT error, eq. (31). These values are plotted in dB scale in Fig. 2. For comparison,
we also plot the MSE of CSIT with classical training based estimation. This plot
clearly shows the benefit of our hybrid two-stage CSIT acquisition strategy: from medium
to large SNR values, the CSIT error incurred by this scheme is much smaller than the error obtained
by training-only CSIT acquisition. Only at very low SNR values does the two-stage scheme
perform worse than the classical training scheme. This happens because we have restricted our
final estimate to come from the digital feedback. Here the total feedback resource (SNR and
$T_{fb}$) does not allow transmission of a sufficient number of bits through the channel, so the quantization
error is quite large. This is aggravated by the poor training based estimate with
which these bits are decoded, further degrading the performance. This degradation can be easily
avoided by selecting an SNR threshold below which the traditional training based scheme is
employed.

Fig. 2. Mean-Square CSIT Errors: $T_{fb} = 20$ and $M = 4$. The novel hybrid scheme performs much better than the classical
training based CSIT acquisition. Gains are significant even with naive use of practical constellations without any coding.
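The SNR threshold mentioned above can be located by comparing the two MSE curves on a grid of SNR values. The sketch below assumes that the classical scheme spends the whole interval $T_{fb}$ on training, so that its MSE follows eq. (34) with $T_a = T_{fb}$; the hybrid MSE values are supplied by the caller, and the numbers used in the usage note are made up for illustration, not results from the paper.

```python
def classical_mse(P, T_fb):
    """ASSUMPTION: classical scheme trains over the whole interval, eq. (34) with T_a = T_fb."""
    return 1.0 / (P * T_fb + 1.0)


def switching_threshold_db(snr_db_grid, hybrid_mse, T_fb):
    """Smallest grid SNR (in dB) from which the hybrid MSE stays below the classical MSE.

    snr_db_grid : increasing list of UL SNR values in dB
    hybrid_mse  : hybrid-scheme MSE (linear scale) at each grid point
    Returns None if the hybrid scheme is not better at the top of the grid.
    """
    threshold = None
    for snr_db, mse in zip(reversed(snr_db_grid), reversed(hybrid_mse)):
        P = 10.0 ** (snr_db / 10.0)
        if mse < classical_mse(P, T_fb):
            threshold = snr_db
        else:
            break
    return threshold
```

For instance, with the illustrative inputs `switching_threshold_db([0, 10, 20], [0.5, 1e-3, 1e-5], 20)`, the hybrid scheme would be used from 10 dB upward and the classical scheme below that.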

To see the optimal split between training and quantized feedback, we have plotted the optimal
training length $T_a$, the corresponding quantized feedback interval $T_q$ and the
optimal outage rate $b$ in Fig. 3.

Fig. 3. Optimal Lengths and Outage Rate: $T_{fb} = 20$ and $M = 4$. With increase in SNR, both the length of the quantized
feedback interval $T_q$ and the outage rate $b$ increase gradually.

B. Optimization Results for Discrete Constellations

In this section, we present simulation results when the fixed constellations QPSK and 16-QAM are
used for quantized feedback transmission. The outage rate $b$ is then fixed by
the constellation (2 for QPSK and 4 for 16-QAM) and the optimization is carried out only over
the resource split between training and quantized feedback, as described in Section VII-A. The curves
for the MSE of CSIT obtained theoretically, the curves obtained by simulation with the actual constellations,
and the corresponding quantization bound for each constellation are plotted in Fig. 4.
The quantization bound gives the quantization error when the maximal number of symbols, $T_{fb} - 1$, is used for
the quantized feedback part. Hence, it gives a lower bound on the MSE of CSIT (a performance
upper bound) for that particular constellation. For comparison, we also plot
the MSE of CSIT for the classical training scheme. This figure shows that from low to medium
SNR values, the novel scheme with QPSK gives a CSIT error below that of the classical training
approach, but 16-QAM is not attractive in this range due to many incorrect detection events. At
high SNR values, the hybrid scheme with QPSK suffers from performance degradation due to its
bounded quantization error, but 16-QAM performs much better than the classical scheme. At very
high values of SNR, even 16-QAM shows bounded performance for the same reason, namely that
its rate does not increase with SNR; one then needs to switch to still larger constellations.

Fig. 4. Mean-Square CSIT Errors: $T_{fb} = 20$ and $M = 4$ (a) QPSK and (b) 16-QAM. The novel hybrid scheme with QPSK
performs better than the classical one from 9 to 25 dB of SNR, but 16-QAM outperforms both after 21 dB.

In Fig. 4, for both QPSK and 16-QAM, we have plotted the MSE of CSIT using our proposed
iterative estimation and detection algorithms from Section V. A surprising fact about the two
proposed iterative algorithms is their similar performance. One would expect the iterative estimation
and detection algorithm (with ML detection) to perform much better than the simplified
iterative estimation and detection algorithm (which uses simple LS detection), but extensive
simulations show that the performance difference between the two algorithms is negligible. In all
our simulations, both algorithms converge very rapidly, almost always by the
second or third iteration. Only in extremely rare instances (fewer than one in ten million)
was convergence not achieved within three iterations.

We do not plot the optimal training and quantized feedback interval lengths owing to space
limitations, but they show the same behavior as displayed in Fig. 3, i.e., the optimal quantized
feedback interval gets larger with increasing SNR for both constellations.

C. Discrete Constellations and Coding 

Now we plot the MSE of CSIT when the quantized feedback is sent using discrete
constellations and rate matching is performed using convolutional codes, as explained in
Section VII-B. The code rates and the puncturing patterns need to be selected carefully. First,
convolutional codes are not available for every desired rate. Second, although puncturing
helps considerably in reaching the desired rate, the puncturing pattern must be chosen carefully, as a random
choice may destroy the code structure and hence ultimately its performance.

We plot the results obtained using three different codes (rate 1/2, rate 2/3 and
rate 3/4) in Fig. 5. All of these codes are used with 16-QAM (4 bits per channel
use). Hence the number of actual information (feedback) bits is 2, 2.67 and 3 per channel use
for the rate-1/2, rate-2/3 and rate-3/4 codes, respectively. For comparison, the plot also shows the MSE
of CSIT obtained using QPSK and 16-QAM constellations without any coding and with the
classical training scheme.

For the rate-1/2 code, the generator matrix is $[171\ 133]_8$ and the traceback length is 30. It performs
better than classical training from 16 to 23 dB of SNR, but QPSK without any coding performs
better than this curve. For the rate-2/3 code, the generator matrix is $[4\ 5\ 17;\ 7\ 4\ 2]_8$ with a traceback
length of 20. From 17 dB onward, it performs better than classical training. It even performs
better than 16-QAM (without coding) below 24 dB of SNR. For the rate-3/4 code, we use the rate-1/2
base code (same as before) with the puncturing pattern [111001] to obtain the final rate
of 3/4.

Fig. 5. Mean-Square CSIT Errors with Convolutional Coding: $T_{fb} = 20$ and $M = 4$. In certain SNR intervals, the coding strategy
performs better than the optimal resource split without coding.
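The construction of the rate-3/4 code from the rate-1/2 base code can be illustrated with a small encoder and puncturer. This is a minimal sketch under stated assumptions: the generators $[171\ 133]_8$ are read with the most significant bit acting on the current input bit, no tail bits are appended, and the pattern [111001] is applied cyclically to the serialized coded stream. The bit ordering of the pattern used in the actual simulations is not specified in the text, so this ordering is an assumption.

```python
def conv_encode(bits, generators=(0o171, 0o133), constraint_length=7):
    """Rate-1/n feedforward convolutional encoder (no tail bits)."""
    reg = 0
    out = []
    for bit in bits:
        # The newest input bit enters at the most significant register position.
        reg = ((bit & 1) << (constraint_length - 1)) | (reg >> 1)
        for g in generators:
            # Output bit = parity of the register taps selected by the generator.
            out.append(bin(reg & g).count("1") % 2)
    return out


def puncture(coded, pattern=(1, 1, 1, 0, 0, 1)):
    """Keep coded bit i when pattern[i % len(pattern)] is 1."""
    return [c for i, c in enumerate(coded) if pattern[i % len(pattern)]]


def effective_rate(num_info_bits, pattern=(1, 1, 1, 0, 0, 1), n_outputs=2):
    """Code rate after puncturing a rate-1/n_outputs mother code."""
    coded_len = n_outputs * num_info_bits
    kept = sum(pattern[i % len(pattern)] for i in range(coded_len))
    return num_info_bits / kept
```

Every 3 information bits produce 6 coded bits, of which the pattern keeps 4, giving the rate 3/4; with 16-QAM this corresponds to 3 information bits per channel use, as stated above.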

D. Imperfect CSIR Analysis 

All the previous results were obtained under the assumption of perfect CSIR,
which is certainly too good to be true. Here we remove this assumption and analyze
how the MSE of CSIT with the novel scheme behaves with imperfect CSIR.

The curves obtained when the quantized feedback is transmitted using QPSK and 16-QAM are
plotted in Fig. 6. We plot these curves under two scenarios. In the first, the CSIR
quality improves with the UL SNR, which is quite logical as, due to
reciprocity, the link quality improves in both directions, and the BS can surely transmit more power
than a small hand-held mobile unit. For this case, we take the MSE of CSIR to be 30 dB
lower than the classical training-only CSIT curve. In the second scenario, the CSIR quality is
held fixed independent of the UL SNR. For this, we plot the MSE of CSIT when the MSE
of CSIR is fixed at -40, -50 and -60 dB. We believe this scenario to be of relatively
less importance. We remark that when the CSIR quality improves with UL SNR, the hybrid approach
performs very close to the perfect CSIR curve. In the other case, when the CSIR quality is kept
fixed, it may become the limiting factor for the MSE of CSIT (if it is not of sufficient quality).

Fig. 6. Mean-Square CSIT Errors with Imperfect CSIR: $T_{fb} = 20$ and $M = 4$ (a) QPSK and (b) 16-QAM. For an imperfect
CSIR of reasonable quality, the novel scheme performs much better than the classical scheme, and the performance approaches
the perfect CSIR case for a good enough CSIR.

IX. Concluding Remarks 

Traditional CSIT acquisition in reciprocal systems, relying exclusively on the use of training
sequences, ignores the shared knowledge of an identical channel between the BS and the user.
We presented a novel approach to CSIT acquisition at the BS for DL transmission in a
reciprocal MIMO communication system, combining the use of a training sequence
with quantized channel feedback. We characterized the optimal CSIT acquisition setup,
proposed two iterative algorithms for the resulting joint estimation and detection problem, and
provided a convergence proof. The novel outage-rate based approach allows optimal resource
partitioning between training and quantized feedback. We proposed two strategies to
overcome the limitation that practical constellations carry an integer number of bits per
channel use: either optimizing the resource split for a particular constellation or using
channel coding for rate matching. The novel combining scheme shows superior performance due
to better exploitation of the reciprocity principle, and the trade-off between the CSIT quality and
the resource utilization improves significantly. It is further shown that with an imperfect CSIR
of reasonable quality, performance gains comparable to the perfect CSIR case are achievable.

Multi-User Extension: The proposed scheme applies verbatim to the case of multiple users.
In the first phase of "pure training", the users should use orthogonal training signals so that the
BS gets an initial estimate of the channel. Then, during the second "quantized feedback" phase,
the UL channel should be used as a MIMO-MAC. The optimization of resources, however, remains
an open problem in this setting. In this scenario, the resource optimization will depend heavily
on the BS transmission strategy, e.g., the optimal resource split could be very different
for TDMA and SDMA. The presence of more users in the system than BS transmit
antennas, and the user scheduling this requires, would add an extra twist to this problem.

Users with Multiple Antennas: There are different ways to treat the fully general case of
multiple users with multiple antennas, where even a single user can be sent multiple
streams. It adds an extra level of complexity to the open problem of multiple single-antenna users.
For users with multiple antennas, a simplifying strategy could be to perform antenna combining as
in [27] to minimize the quantization error. This scheme is promising as it reduces the feedback
requirement by converting the MIMO channel into a vector channel in a direction of minimal
quantization error. Effectively, it then becomes the multiple single-antenna user extension
of our work.